CHAPTER 16 Getting Straight Talk on Straight-Line Regression 227

But the p value for the slope is very important. Assuming α = 0.05, if it’s less than 0.05, it means that the slope of the fitted straight line is statistically significantly different from zero. This means that the X and Y variables are statistically significantly associated with each other. A p value greater than 0.05 would indicate that the true slope could equal zero, and there would be no conclusive evidence for a statistically significant association between X and Y. In Figure 16-4, the p value for the slope is 0.0127, which means that the slope is statistically significantly different from zero. This tells you that in your model, body weight is statistically significantly associated with SBP.

If you want to test for a significant correlation between two variables at α = 0.05, you can look at the p value for the slope of the least-squares straight line. If it’s less than 0.05, then the X and Y variables are also statistically significantly correlated. The p value for the significance of the slope in a straight-line regression is always exactly the same as the p value for the correlation test of whether r is statistically significantly different from zero, as described in Chapter 15.
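The reason the two p values match can be checked numerically. The following is a minimal sketch, assuming a small made-up data set (the x and y values are illustrative only and are not the data behind Figure 16-4). It computes the t statistic for the slope and the t statistic for Pearson r. Both tests use n − 2 degrees of freedom, so identical t statistics imply identical p values.

```python
import math

# Made-up illustrative data (NOT the book's example data).
x = [60.0, 72.0, 75.0, 81.0, 90.0, 95.0, 102.0, 110.0]
y = [110.0, 125.0, 118.0, 130.0, 128.0, 142.0, 135.0, 150.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard error of the slope, from the residual sum of squares.
ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(ss_res / (n - 2) / sxx)
t_slope = slope / se_slope

# Pearson r and the t statistic used to test whether r differs from zero.
r = sxy / math.sqrt(sxx * syy)
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"t for slope: {t_slope:.6f}")
print(f"t for r:     {t_r:.6f}")
```

The two printed values agree up to floating-point rounding. Turning either t statistic into a p value needs a Student t distribution function, which the Python standard library does not provide, so that step is left out of this sketch.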

Wrapping up with measures of goodness-of-fit

The last few lines of output in Figure 16-4 contain several indicators of how well the straight line represents the data. The following sections describe this part of the output.

The correlation coefficient

Most straight-line regression programs provide the classic Pearson r correlation coefficient between X and Y (see Chapter 15 for details). But the program may give you the correlation coefficient in a roundabout way by outputting r² rather than r itself. In Figure 16-4, at the bottom under Multiple R-squared, the r² is listed as 0.2984. If you want Pearson r, just use Microsoft Excel or a calculator to take the square root of 0.2984 to get 0.546.

The r² is never negative, because the square of any number is never negative. But the correlation coefficient can be positive or negative, depending on whether the fitted line slopes upward or downward. If the fitted line slopes downward, make your r value negative.
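The square-root-and-sign rule can be written as a short helper. This is a minimal sketch: the function name `pearson_r_from_r_squared` is hypothetical, and the slope values passed in are placeholders whose only purpose is to supply a sign.

```python
import math

def pearson_r_from_r_squared(r_squared, slope):
    """Recover Pearson r from r-squared, taking the sign from the slope."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must be between 0 and 1")
    # copysign returns the magnitude of the first argument
    # with the sign of the second argument.
    return math.copysign(math.sqrt(r_squared), slope)

# 0.2984 is the Multiple R-squared from Figure 16-4.
# The slopes 1.0 and -1.0 are placeholders; only their sign matters.
r_up = pearson_r_from_r_squared(0.2984, slope=1.0)
r_down = pearson_r_from_r_squared(0.2984, slope=-1.0)

print(round(r_up, 3))    # 0.546
print(round(r_down, 3))  # -0.546
```

An upward-sloping line gives the 0.546 quoted above, and the same r² with a downward-sloping line would give −0.546.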

Why did the program give you r² instead of r in the first place? It’s because r² is a useful estimate called the coefficient of determination. It tells you what percent of the total variability in the Y variable can be explained by the fitted line.

» An r² value of 1 means that the points lie exactly on the fitted line, with no scatter at all.